December 11, 2015
The raw data were provided as an Excel file with ~300 tabs like this:
Along with the time series measurements, the following static covariates were given for each patient:
An outcome score called "GOS" was also given, measured at 3, 6, 12, and 24 months:
For the sake of modeling, the following assumptions were made:
There were 339 patients in the raw data but only 268 in the filtered dataset.
Note: Of the 268 patients in the filtered set, 18 had a missing 3-month GOS.
Frequency of non-time-dependent values:
Relationships between static variables and outcomes are pretty clear:
Time series measurements for 4 random patients:
## Other Stuff

| Term | Estimate | Uncond. variance | Nb models | Importance | +/- (alpha=0.05) |
|---|---|---|---|---|---|
| pha_0_7.35 | -0.0372521 | 0.0069486 | 29 | 0.2454711 | 0.1641506 |
| pao2_100_inf | -0.0416354 | 0.0076650 | 35 | 0.2862554 | 0.1724048 |
| paco2_0_35 | -0.1039550 | 0.0256525 | 52 | 0.4475231 | 0.3153981 |
| pbto2_0_20 | -0.1391531 | 0.0323655 | 52 | 0.5358749 | 0.3542705 |
| sex | 0.1722509 | 0.0412554 | 52 | 0.5767763 | 0.3999763 |
| marshall | -0.2438273 | 0.0502777 | 65 | 0.7168432 | 0.4415518 |
| paco2_45_inf | 0.2923234 | 0.0487595 | 77 | 0.8269899 | 0.4348341 |
| pha_7.45_inf | -0.3193026 | 0.0514109 | 79 | 0.8379189 | 0.4464999 |
| pbto2_100_inf | -0.8610325 | 0.3929205 | 98 | 0.9895500 | 1.2343722 |
| (Intercept) | -1.5639168 | 0.0472386 | 100 | 1.0000000 | 0.4279987 |
| age | -0.7251374 | 0.0402352 | 100 | 1.0000000 | 0.3949996 |
| gcs | 0.5393797 | 0.0343400 | 100 | 1.0000000 | 0.3649167 |
| icp1_20_inf | -0.6818782 | 0.1017011 | 100 | 1.0000000 | 0.6279955 |
| pao2_0_30 | -0.7211150 | 0.1144418 | 100 | 1.0000000 | 0.6661716 |
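
One way to read the table is to treat the last column as the half-width of a 95% interval around each model-averaged estimate and check which intervals exclude zero. The sketch below does that with the rows copied from the table. The names `rows`, `excludes_zero`, `clear_terms`, and `multipliers` are illustrative, and the interpretation of the `+/-` column is an assumption based on its label, not something the output states.

```python
import math

# Rows copied from the table above:
# term -> (estimate, unconditional variance, +/- half-width at alpha = 0.05)
rows = {
    "pha_0_7.35": (-0.0372521, 0.0069486, 0.1641506),
    "pao2_100_inf": (-0.0416354, 0.0076650, 0.1724048),
    "paco2_0_35": (-0.1039550, 0.0256525, 0.3153981),
    "pbto2_0_20": (-0.1391531, 0.0323655, 0.3542705),
    "sex": (0.1722509, 0.0412554, 0.3999763),
    "marshall": (-0.2438273, 0.0502777, 0.4415518),
    "paco2_45_inf": (0.2923234, 0.0487595, 0.4348341),
    "pha_7.45_inf": (-0.3193026, 0.0514109, 0.4464999),
    "pbto2_100_inf": (-0.8610325, 0.3929205, 1.2343722),
    "(Intercept)": (-1.5639168, 0.0472386, 0.4279987),
    "age": (-0.7251374, 0.0402352, 0.3949996),
    "gcs": (0.5393797, 0.0343400, 0.3649167),
    "icp1_20_inf": (-0.6818782, 0.1017011, 0.6279955),
    "pao2_0_30": (-0.7211150, 0.1144418, 0.6661716),
}


def excludes_zero(estimate, half_width):
    """True when the interval estimate +/- half_width does not contain 0."""
    return abs(estimate) > half_width


# Terms whose interval (under the assumed reading) stays away from zero.
clear_terms = [t for t, (est, _var, hw) in rows.items() if excludes_zero(est, hw)]

# Ratio of the half-width to the square root of the unconditional variance.
multipliers = {t: hw / math.sqrt(var) for t, (_est, var, hw) in rows.items()}

print(clear_terms)
# ['(Intercept)', 'age', 'gcs', 'icp1_20_inf', 'pao2_0_30']
```

Under this reading, the terms whose intervals exclude zero are exactly the ones with importance 1.0. The ratio in `multipliers` comes out at about 1.97 for every row, which would be consistent with a t-based interval on the unconditional standard error.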
A modified model:
\[ \operatorname{logit}(y_i) = \alpha + \beta \cdot X_i + f(G_i) \]
where
\[ X_i = [Gender_i, Age_i, ComaScore_i, MarshallScore_i] \]
and
\[ f(G_i) = \frac{1}{n_i} \sum_{j=1}^{n_i} \left( \frac{c_1}{1 + e^{-c_2(G_{ij} - c_3)}} + \frac{c_4}{1 + e^{-c_5(G_{ij} - c_6)}} \right) \]
Here \( G_{ij} \) is the \( j \)-th time series measurement for patient \( i \), and
\[ n_i = \text{length of the time series for patient } i \]
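
As a minimal sketch of how the model above could be evaluated for one patient, the Python below computes the time-averaged sum of two logistic curves and then applies the inverse logit to the full linear predictor. The function names (`sigmoid_term`, `f`, `outcome_probability`) and every numeric value in the example call are hypothetical, chosen only to make the sketch runnable; they are not fitted values from this analysis.

```python
import math


def sigmoid_term(g, height, steepness, midpoint):
    """One logistic curve: height / (1 + exp(-steepness * (g - midpoint)))."""
    return height / (1.0 + math.exp(-steepness * (g - midpoint)))


def f(series, c):
    """Mean over one patient's time series of the sum of two logistic curves.

    c is the tuple (c1, c2, c3, c4, c5, c6) from the formula above.
    """
    c1, c2, c3, c4, c5, c6 = c
    total = sum(
        sigmoid_term(g, c1, c2, c3) + sigmoid_term(g, c4, c5, c6)
        for g in series
    )
    return total / len(series)


def outcome_probability(alpha, beta, x, series, c):
    """Inverse logit of alpha + beta . x + f(series)."""
    eta = alpha + sum(b * xi for b, xi in zip(beta, x)) + f(series, c)
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical inputs: x = [gender, age, coma score, Marshall score],
# already scaled; series = one patient's measurements.
p = outcome_probability(
    alpha=-1.0,
    beta=[0.2, -0.7, 0.5, -0.2],
    x=[1.0, 0.3, -0.5, 0.0],
    series=[18.0, 22.0, 25.0, 19.5],
    c=(-1.0, 0.5, 20.0, 0.5, 0.5, 30.0),
)
print(p)
```

Dividing by the series length keeps `f` on the same scale for patients with short and long recordings, which is what the \( 1/n_i \) factor in the formula does.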
These are functions drawn from the priors in the model; they show the range of shapes the function can take:
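
A rough sketch of how such prior draws could be generated is given here. The priors are not stated in this report, so the standard normal priors on all six coefficients and the standardized measurement grid from -3 to 3 are purely hypothetical placeholders, as are the names `two_sigmoid` and `draw_coefficients`.

```python
import math
import random


def two_sigmoid(g, c):
    """Sum of two logistic curves with coefficients c = (c1, ..., c6)."""
    c1, c2, c3, c4, c5, c6 = c
    return (c1 / (1.0 + math.exp(-c2 * (g - c3)))
            + c4 / (1.0 + math.exp(-c5 * (g - c6))))


def draw_coefficients(rng):
    """HYPOTHETICAL priors: independent standard normals for c1..c6."""
    return tuple(rng.gauss(0.0, 1.0) for _ in range(6))


rng = random.Random(0)                       # fixed seed for repeatability
grid = [-3.0 + 0.25 * k for k in range(25)]  # assumed standardized scale
draws = [draw_coefficients(rng) for _ in range(20)]

# One curve per prior draw, evaluated on the grid; these are what would be plotted.
curves = [[two_sigmoid(g, c) for g in grid] for c in draws]
```

Each curve is bounded in absolute value by \( |c_1| + |c_4| \), so the spread of the height priors directly controls how large an effect the function can express.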
By semi-simulated, I mean taking the real data and hard-coding the coefficient and function values.
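
A semi-simulation of this kind might look like the sketch below: the covariates and time series are kept as given, the coefficients are fixed by hand, and binary outcomes are drawn from the resulting probabilities. The toy `patients` list only stands in for the real data, and the hard-coded values of `alpha`, `beta`, and `c` are hypothetical; drawing the outcome as a Bernoulli variable is also an assumption about how the simulation was done.

```python
import math
import random


def f(series, c):
    """Time-averaged sum of two logistic curves, c = (c1, ..., c6)."""
    c1, c2, c3, c4, c5, c6 = c
    return sum(
        c1 / (1.0 + math.exp(-c2 * (g - c3)))
        + c4 / (1.0 + math.exp(-c5 * (g - c6)))
        for g in series
    ) / len(series)


def simulate_outcomes(patients, alpha, beta, c, rng):
    """Draw one 0/1 outcome per patient from hand-fixed coefficients."""
    outcomes = []
    for x, series in patients:
        eta = alpha + sum(b * xi for b, xi in zip(beta, x)) + f(series, c)
        p = 1.0 / (1.0 + math.exp(-eta))  # inverse logit
        outcomes.append(1 if rng.random() < p else 0)
    return outcomes


# Placeholder rows standing in for the real (covariates, time series) pairs.
patients = [
    ([1.0, 0.3, -0.5, 0.0], [18.0, 22.0, 25.0]),
    ([0.0, -1.1, 0.8, 1.0], [35.0, 31.0]),
    ([1.0, 0.9, -1.2, 2.0], [12.0, 14.5, 13.0, 15.0]),
]

simulated = simulate_outcomes(
    patients,
    alpha=-1.0,                           # hard-coded, hypothetical
    beta=[0.2, -0.7, 0.5, -0.2],          # hard-coded, hypothetical
    c=(-1.0, 0.5, 20.0, 0.5, 0.5, 30.0),  # hard-coded, hypothetical
    rng=random.Random(1),
)
print(simulated)
```

Because the true coefficients are known by construction, fitting the model to `simulated` shows whether the estimation procedure can recover them on data with the real covariate structure.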